(1) GENERAL INFORMATION
(i) APPLICANT: Genencor International, Inc. 

(ii) TITLE OF THE INVENTION: ESTERASE ENZYMES, DNA ENCODING 
ESTERASE ENZYMES AND VECTORS AND HOST CELLS INCORPORATING SAME 

(iii) NUMBER OF SEQUENCES: 35 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Genencor International, Inc. 

(B) STREET: 925 Page Mill Road 

(C) CITY: Palo Alto 

(D) STATE: CA 

(E) COUNTRY: USA 

(F) ZIP: 94304-1013

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Diskette 

(B) COMPUTER: IBM Compatible 

(C) OPERATING SYSTEM: DOS 

(D) SOFTWARE: FastSEQ for Windows Version 2.0 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 08/952,445 

(B) FILING DATE: 18-NOV-1997 

(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: 08/722,713 

(B) FILING DATE: 30-SEP-1996 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Stone, Christopher L. 

(B) REGISTRATION NUMBER: 35,696

(C) REFERENCE/DOCKET NUMBER: GC362-2-US

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 650-846-7555 

(B) TELEFAX: 650-845-6504 



(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

Ala Ser Thr Gln Gly Ile Ser Glu Asp Leu Tyr Ser Arg Leu Val Glu
1 5 10 15

Met Ala Thr Ile Ser Gln Ala Ala Tyr Xaa Asp Leu Leu Asn Ile Pro
20 25 30

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Xaa Thr Val Gly Phe Gly Pro Tyr 
1 5 



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

Phe Gly Leu His Leu Xaa Gln Xaa Met
1 5 



(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4: 

Xaa Ile Ser Glu Asp Leu Tyr Ser
1 5 



(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

Tyr Ile Gly Trp Ser Phe Tyr Asn Ala
1 5 



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Gly Ile Ser Glu Asp Leu Tyr Xaa Xaa Gln
1 5 10



(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

Xaa Ile Ser Glu Ser Leu Tyr Xaa Xaa Arg
1 5 10



(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Gly Ile Ser Glu Asp Leu Tyr
1 5 



(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9 



Leu Glu Pro Pro Tyr Thr Gly 
1 5 



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

Xaa Ala Asn Asp Gly Ile Pro Asn Leu Pro Pro Val Glu Gln
1 5 10

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

Tyr Pro Asp Tyr Ala Leu Tyr Lys 
1 5 



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 
CGGGAATTCG CWSACCARGG AT 



(2) INFORMATION FOR SEQ ID NO:13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 





Ala Ser Thr Gln Gly Ile Ser Glu Asp Leu Tyr Ser Arg Leu Val Glu
1 5 10 15

Met Ala Thr Ile Ser Gln Ala Ala Tyr Ala Asp Leu Leu Asn Ile Pro
20 25 30



(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 
CGGGAATTCT AYTAYATHGG TGGGT 25 
(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

Val His Gly Gly Tyr Tyr Ile Gly Trp Val Ser Val Gln Asp Gln Val
1 5 10 15



(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16: 
CGGGAATTCA CCCACCDATR TARTA 25

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 



Val His Gly Gly Tyr Tyr Ile Gly Trp Val Ser Val Gln Asp Gln Val
1 5 10 15



(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 
CGGGAATTCT TGGATCCRTC RTT 



(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

Thr Asp Ala Phe Gln Ala Ser Ser Pro Asp Thr Thr Gln Tyr Phe Arg
1 5 10 15

Val Thr His Ala Asn Asp Gly Ile Pro Asn Leu
20 25



(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20: 
CGGGAATTCA TCCRTCRTTG CRTG 



(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

Thr Asp Ala Phe Gln Ala Ser Ser Pro Asp Thr Thr Gln Tyr Phe Arg
1 5 10 15

Val Thr His Ala Asn Asp Gly Ile Pro Asn Leu
20 25



(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 
CGGGAATTCG CYTGRAAGCR TCGTCAT 



(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23: 



Met Thr Asp Ala Phe Gln Ala Ser Ser Pro Asp Thr Thr Gln Tyr Phe
1 5 10 15

Arg Val Thr His Ala Asn Asp Gly Ile Pro Asn Leu
20 25



(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 



Met Thr Asp Ala Phe Gln Ala Ser Ser Pro Asp Thr Thr Gln Tyr Phe
1 5 10 15

Arg Val Thr His Ala Asn Asp Gly Ile Pro Asn Leu
20 25



(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 650 base pairs 

(B) TYPE: nucleic acid 



(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 

GCCTCTACGC AGGGCATCTC CGAAGACCTC TACAGCCGTT TAGTCGAAAT GGCCACTATC 60 

TCCCAAGCTG CCTACGCCGA CCTGTGCAAC ATTCCGTCGA CTATTATCAA GGGAGAGAAA 120

ATTTACAATT CTCAAACTGA CATTAACGGA TGGATCCTCC GCGACGACAG CAGCAAAGAA 180 

ATAATCACCG TCTTCCGTGG CACTGGTAGT GATACGAATC TACAACTCGA TACTAACTAC 240 

ACCCTCACGC CTTTCGACAC CCTACCACAA TGCAACGGTT GTGAAGTACA CGGTGGATAT 300 

TATATTGGAT GGGTCTCCGT CCAGGACCAA GTCGAGTCGC TTGTCAAACA GCAGGTTAGC 360 

CAGTATCCGG ACTATGCGCT GACTGTGACG GGCCACAGGT ATGCCCTCGT GATTTCTTTC 420 

AATTAAGTGT ATAATACTCA CTAACTCTAC GATAGTCTCG GAGCGTCCCT GGCAGCACTC 480 

ACTGCCGCCC AGCTGTCTGC GACATACGAC AACATCCGCC TGTACACCTT CGGCGAACCG 540 

CGCAGCGGCA ATCAGGCCTT CGCGTCGTAC ATGAACGATG CCTTCCAAGC CTCGAGCCCA 600

GATACGACGC AGTATTTCCG GGTCACTCAT GCCAACGACG GCATCCCAAA 650 






(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 197 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26: 

Ala Ser Thr Gln Gly Ile Ser Glu Asp Leu Tyr Ser Arg Leu Val Glu
1 5 10 15

Met Ala Thr Ile Ser Gln Ala Ala Tyr Ala Asp Leu Cys Asn Ile Pro
20 25 30

Ser Thr Ile Ile Lys Gly Glu Lys Ile Tyr Asn Ser Gln Thr Asp Ile
35 40 45

Asn Gly Trp Ile Leu Arg Asp Asp Ser Ser Lys Glu Ile Ile Thr Val
50 55 60

Phe Arg Gly Thr Gly Ser Asp Thr Asn Leu Gln Leu Asp Thr Asn Tyr
65 70 75 80

Thr Leu Thr Pro Phe Asp Thr Leu Pro Gln Cys Asn Gly Cys Glu Val
85 90 95

His Gly Gly Tyr Tyr Ile Gly Trp Val Ser Val Gln Asp Gln Val Glu
100 105 110

Ser Leu Val Lys Gln Gln Val Ser Gln Tyr Pro Asp Tyr Ala Leu Thr
115 120 125

Val Thr Gly His Ser Leu Gly Ala Ser Leu Ala Ala Leu Thr Ala Ala
130 135 140

Gln Leu Ser Ala Thr Tyr Asp Asn Ile Arg Leu Tyr Thr Phe Gly Glu
145 150 155 160

Pro Arg Ser Gly Asn Gln Ala Phe Ala Ser Tyr Met Asn Asp Ala Phe
165 170 175

Gln Ala Ser Ser Pro Asp Thr Thr Gln Tyr Phe Arg Val Thr His Ala
180 185 190

Asn Asp Gly Ile Pro
195




(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2436 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 





CCATGGTGGT GTCGATATCG GCAGTAGTCT TTGCCGAAAC GTTGAGGGTT ACAGTGATCT 60

GCGTCGGACA TACTTCGGGG AATCTACGGC GGAATATCAA AGTCTTCGGA ATATCCATAT 120

TGGGAAAGGA CAGAAGCTCC GGGGTAGTTT GATAGATGAG CTCCGGTGTA TTAAATCGGG 180

AGCTGACAGG AGTGAGCGTC ATGTAGACCA TCTAGTAATG TCAGTCGCGC GCAATTTCGC 240

ACATGAAACA AGTTGATTTC GGGACCCCAT TGTTACATCT CTCGGCTACA GCTCGAGATG 300

TGCCTGCCGA GTATACTTAG AAGCCATGCC AGCGTGTTGT TATACGACCA AAAGTCAGGG 360

AATATGAAAC GATCGTCGGA TATTTCTTGT TTTTATCCTA AATTAGTCTT CCAGTGGTTT 420

ATTTAAGAGA TAGATCCCTT CACAAACACT CATCCAACGG ACTTCTCATA CCACTCATTG 480

ACATAATTTC AAACAGCTCC AGGCGCATTT AGTTCAACAT GAAGCAATTC TCCGCCAAAC 540

ACGTCCTCGC AGTTGTGGTG ACTGCAGGGC ACGCCTTAGC AGCCTCTACG CAAGGCATCT 600

CCGAAGACCT CTACAGCCGT TTAGTCGAAA TGGCCACTAT CTCCCAAGCT GCCTACGCCG 660

ACCTGTGCAA CATTCCGTCG ACTATTATCA AGGGAGAGAA AATTTACAAT TCTCAAACTG 720

ACATTAACGG ATGGATCCTC CGCGACGACA GCAGCAAAGA AATAATCACC GTCTTCCGTG 780

GCACTGGTAG TGATACGAAT CTACAACTCG ATACTAACTA CACCCTCACG CCTTTCGACA 840

CCCTACCACA ATGCAACGGT TGTGAAGTAC ACGGTGGATA TTATATTGGA TGGGTCTCCG 900

TCCAGGACCA AGTCGAGTCG CTTGTCAAAC AGCAGGTTAG CCAGTATCCG GACTATGCGC 960

TGACTGTGAC GGGCCACAGG TATGCCCTCG TGATTTCTTT CAATTAAGTG TATAATACTC 1020

ACTAACTCTA CGATAGTCTC GGAGCGTCCC TGGCAGCACT CACTGCCGCC CAGCTGTCTG 1080

CGACATACGA CAACATCCGC CTGTACACCT TCGGCGAACC GCGCAGCGGC AATCAGGCCT 1140

TCGCGTCGTA CATGAACGAT GCCTTCCAAG CCTCGAGCCC AGATACGACG CAGTATTTCC 1200

GGGTCACTCA TGCCAACGAC GGCATCCCAA ACCTGCCCCC GGTGGAGCAG GGGTACGCCC 1260

ATGGCGGTGT AGAGTACTGG AGCGTTGATC CTTACAGCGC CCAGAACACA TTTGTCTGCA 1320

CTGGGGATGA AGTGCAGTGC TGTGAGGCCC AGGGCGGACA GGGTGTGAAT AATGCGCACA 1380

CGACTTATTT TGGGATGACG AGCGGAGCCT GTACATGGTG ATCAGTCATT TCAGCCTCCC 1440

CGAGTGTACC AGGAAAGATG GATGTCCTGG AGAGGGCATG CATGTACGTA TACCCGAAGC 1500

ACACTTTTTC GGTAAATCAG GACATGTAAT AAGTTCCTTC CATGAATAGA TATGGTTACC 1560

CTCACCATAA GCCTTGAGGT TGCCTTTCTC TTTTGATTGT GAATATATAT TTAAAGTAGA 1620

TGACAGATAT CTCTAAACAC CTTATCCGCT TAAACCCATC ATAGATTGTG TCACGTGATA 1680

GACCCCTTGA ATGATGAGCG AAATGTATCA GTCCCGTTTA AATCAAACCC TTTCAGCCTA 1740

GCACAGTCAG AATACACCAA CCCCATTCTA AGGTAGTACT AAATATGAAT ACAGCCTAAA 1800

TGCATCGCTA TATGATCCCA TAAAGAAGCA ACAACCTTTC AGATCTCGTT TTGCGCTGCG 1860

AAGAGCTAGC TCTACCATGG TCTCAATTAT GAGTGGAGCG TTTAGTCTCG TTTAAGCCTA 1920

GCTATCTTAT AAGGACAACA CATGTACATG GGCTTACTTG TAGAGAGGTA GGATCCCGGG 1980

CTTCTTCACA TCTCGAGGAG TTGTCTACAC GTCGCGTCCA TGTCATAAGC CGGTACTCGA 2040

CGTTGTCGTG ACCGTGACCC AGACCCCTGT TGATAGCGTT GAGAAGGCCC TATATTTGAA 2100

TTTCCAATCT CAGCTTTACG AAGATATGCC CATGGTGGAG GGTTAGTAAA CCGATGATGA 2160

TCGTGTGCAG CATGAGATGA GACCGTGGCC AATCCTGTTC AAATGCCAAG ACCCGCCTCC 2220

TACCACATGT AAGGCATCCG TCGGCCGCAC GTTGAATTGT GCAAATGCCG AGATCATAAA 2280

AGCGGCCACA CTTCCACGTC GGTACTGGAT GGGTTGCGCG TGGCCATACT GTGTTTTCCA 2340

TTGCGTGGGT CGTTCGTGTT ACTGCGACGC AGATTCTGTA GGCAAGGCGC AGGGCTCTCT 2400

TCTGAGGTAG AAAACACCCC ATATTAATCT GAATTC 2436




(2) INFORMATION FOR SEQ ID NO: 28:









(i) SEQUENCE CHARACTERISTICS: 




(A) LENGTH: 281 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 





Met Lys Gln Phe Ser Ala Lys His Val Leu Ala Val Val Val Thr Ala
1 5 10 15

Gly His Ala Leu Ala Ala Ser Thr Gln Gly Ile Ser Glu Asp Leu Tyr
20 25 30

Ser Arg Leu Val Glu Met Ala Thr Ile Ser Gln Ala Ala Tyr Ala Asp
35 40 45

Leu Cys Asn Ile Pro Ser Thr Ile Ile Lys Gly Glu Lys Ile Tyr Asn
50 55 60

Ser Gln Thr Asp Ile Asn Gly Trp Ile Leu Arg Asp Asp Ser Ser Lys
65 70 75 80

Glu Ile Ile Thr Val Phe Arg Gly Thr Gly Ser Asp Thr Asn Leu Gln
85 90 95

Leu Asp Thr Asn Tyr Thr Leu Thr Pro Phe Asp Thr Leu Pro Gln Cys
100 105 110

Asn Gly Cys Glu Val His Gly Gly Tyr Tyr Ile Gly Trp Val Ser Val
115 120 125

Gln Asp Gln Val Glu Ser Leu Val Lys Gln Gln Val Ser Gln Tyr Pro
130 135 140

Asp Tyr Ala Leu Thr Val Thr Gly His Ser Leu Gly Ala Ser Leu Ala
145 150 155 160

Ala Leu Thr Ala Ala Gln Leu Ser Ala Thr Tyr Asp Asn Ile Arg Leu
165 170 175

Tyr Thr Phe Gly Glu Pro Arg Ser Gly Asn Gln Ala Phe Ala Ser Tyr
180 185 190

Met Asn Asp Ala Phe Gln Ala Ser Ser Pro Asp Thr Thr Gln Tyr Phe
195 200 205

Arg Val Thr His Ala Asn Asp Gly Ile Pro Asn Leu Pro Pro Val Glu
210 215 220

Gln Gly Tyr Ala His Gly Gly Val Glu Tyr Trp Ser Val Asp Pro Tyr
225 230 235 240

Ser Ala Gln Asn Thr Phe Val Cys Thr Gly Asp Glu Val Gln Cys Cys
245 250 255

Glu Ala Gln Gly Gly Gln Gly Val Asn Asn Ala His Thr Thr Tyr Phe
260 265 270

Gly Met Thr Ser Gly Ala Cys Thr Trp
275 280



(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2436 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 



CCATGGTGGT GTCGATATCG GCAGTAGTCT TTGCCGAAAC GTTGAGGGTT ACAGTGATCT 60 

GCGTCGGACA TACTTCGGGG AATCTACGGC GGAATATCAA AGTCTTCGGA ATATCCATAT 120 

TGGGAAAGGA CAGAAGCTCC GGGGTAGTTT GATAGATGAG CTCCGGTGTA TTAAATCGGG 180 

AGCTGACAGG AGTGAGCGTC ATGTAGACCA TCTAGTAATG TCAGTCGCGC GCAATTTCGC 240

ACATGAAACA AGTTGATTTC GGGACCCCAT TGTTACATCT CTCGGCTACA GCTCGAGATG 300 

TGCCTGCCGA GTATACTTAG AAGCCATGCC AGCGTGTTGT TATACGACCA AAAGTCAGGG 360 

AATATGAAAC GATCGTCGGA TATTTCTTGT TTTTATCCTA AATTAGTCTT CCAGTGGTTT 420 

ATTTAAGAGA TAGATCCCTT CACAAACACT CATCCAACGG ACTTCTCATA CCACTCATTG 480 

ACATAATTTC AAACAGCTCC AGGCGCATTT AGTTCAACAT GAAGCAATTC TCCGCCAAAC 540 

ACGTCCTCGC AGTTGTGGTG ACTGCAGGGC ACGCCTTAGC AGCCTCTACG CAAGGCATCT 600 

CCGAAGACCT CTACAGCCGT TTAGTCGAAA TGGCCACTAT CTCCCAAGCT GCCTACGCCG 660 

ACCTGTGCAA CATTCCGTCG ACTATTATCA AGGGAGAGAA AATTTACAAT TCTCAAACTG 720 

ACATTAACGG ATGGATCCTC CGCGACGACA GCAGCAAAGA AATAATCACC GTCTTCCGTG 780 

GCACTGGTAG TGATACGAAT CTACAACTCG ATACTAACTA CACCCTCACG CCTTTCGACA 840

CCCTACCACA ATGCAACGGT TGTGAAGTAC ACGGTGGATA TTATATTGGA TGGGTCTCCG 900 

TCCAGGACCA AGTCGAGTCG CTTGTCAAAC AGCAGGTTAG CCAGTATCCG GACTATGCGC 960 

TGACTGTGAC GGGCCACAGG TATGCCCTCG TGATTTCTTT CAATTAAGTG TATAATACTC 1020 

ACTAACTCTA CGATAGTCTC GGAGCGTCCC TGGCAGCACT CACTGCCGCC CAGCTGTCTG 1080

CGACATACGA CAACATCCGC CTGTACACCT TCGGCGAACC GCGCAGCGGC AATCAGGCCT 1140

TCGCGTCGTA CATGAACGAT GCCTTCCAAG CCTCGAGCCC AGATACGACG CAGTATTTCC 1200

GGGTCACTCA TGCCAACGAC GGCATCCCAA ACCTGCCCCC GGTGGAGCAG GGGTACGCCC 1260

ATGGCGGTGT AGAGTACTGG AGCGTTGATC CTTACAGCGC CCAGAACACA TTTGTCTGCA 1320

CTGGGGATGA AGTGCAGTGC TGTGAGGCCC AGGGCGGACA GGGTGTGAAT AATGCGCACA 1380

CGACTTATTT TGGGATGACG AGCGGAGCCT GTACATGGTG ATCAGTCATT TCAGCCTCCC 1440

CGAGTGTACC AGGAAAGATG GATGTCCTGG AGAGGGCATG CATGTACGTA TACCCGAAGC 1500

ACACTTTTTC GGTAAATCAG GACATGTAAT AAGTTCCTTC CATGAATAGA TATGGTTACC 1560

CTCACCATAA GCCTTGAGGT TGCCTTTCTC TTTTGATTGT GAATATATAT TTAAAGTAGA 1620

TGACAGATAT CTCTAAACAC CTTATCCGCT TAAACCCATC ATAGATTGTG TCACGTGATA 1680

GACCCCTTGA ATGATGAGCG AAATGTATCA GTCCCGTTTA AATCAAACCC TTTCAGCCTA 1740

GCACAGTCAG AATACACCAA CCCCATTCTA AGGTAGTACT AAATATGAAT ACAGCCTAAA 1800

TGCATCGCTA TATGATCCCA TAAAGAAGCA ACAACCTTTC AGATCTCGTT TTGCGCTGCG 1860

AAGAGCTAGC TCTACCATGG TCTCAATTAT GAGTGGAGCG TTTAGTCTCG TTTAAGCCTA 1920

GCTATCTTAT AAGGACAACA CATGTACATG GGCTTACTTG TAGAGAGGTA GGATCCCGGG 1980

CTTCTTCACA TCTCGAGGAG TTGTCTACAC GTCGCGTCCA TGTCATAAGC CGGTACTCGA 2040

CGTTGTCGTG ACCGTGACCC AGACCCCTGT TGATAGCGTT GAGAAGGCCC TATATTTGAA 2100

TTTCCAATCT CAGCTTTACG AAGATATGCC CATGGTGGAG GGTTAGTAAA CCGATGATGA 2160 

TCGTGTGCAG CATGAGATGA GACCGTGGCC AATCCTGTTC AAATGCCAAG ACCCGCCTCC 2220 

TACCACATGT AAGGCATCCG TCGGCCGCAC GTTGAATTGT GCAAATGCCG AGATCATAAA 2280

AGCGGCCACA CTTCCACGTC GGTACTGGAT GGGTTGCGCG TGGCCATACT GTGTTTTCCA 2340 

TTGCGTGGGT CGTTCGTGTT ACTGCGACGC AGATTCTGTA GGCAAGGCGC AGGGCTCTCT 2400 

TCTGAGGTAG AAAACACCCC ATATTAATCT GAATTC 2436 

(2) INFORMATION FOR SEQ ID NO: 30: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 



GGCCTGCAGC CCCGCAAACT ACGGGTACGT CC 32



(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31 
CGCGCTGCAG GCTCTTTCTG GTAATACTAT GCTGG 



(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32 
GGCTTAATTA ACGTGCTGGT CTCGGATCTT TGGCGG 



(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33 
GGGGCGCGCC AGATCTAGTA CCGATGTTGA GGATGAAGCT 



(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34 
GCCCAGATCT CCGCAATGAA GCAATTCTCC GCCAAACAC 



(2) INFORMATION FOR SEQ ID NO: 35: 
(i) SEQUENCE CHARACTERISTICS: 



(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35 

AATAGTCGAC GGAATGTTGC ACAGG 

GC362-2-PCT 



